Characterizing Discriminative Patterns
Abstract
Discriminative patterns are association patterns that occur with disproportionate frequency in some classes versus others, and have been studied under names such as emerging patterns and contrast sets. Such patterns have demonstrated considerable value for classification and subgroup discovery, but a detailed understanding of the types of interactions among items in a discriminative pattern is lacking. To address this issue, we propose to categorize discriminative patterns according to four types of item interaction: (i) driver-passenger, (ii) coherent, (iii) independent additive and (iv) synergistic beyond independent additive. The coherent, additive, and synergistic patterns are of practical importance, with the latter two representing a gain in the discriminative power of a pattern over its subsets. Synergistic patterns are the most restrictive, but perhaps the most interesting, since they capture a cooperative effect that is more than the sum of the effects of the individual items in the pattern. For domains such as biomedical and genetic research, differentiating among these types of patterns is critical since each yields very different biological interpretations. For general domains, the characterization provides a novel view of the nature of the discriminative patterns in a dataset, which yields insights beyond those provided by current approaches that focus mostly on pattern-based classification and subgroup discovery. This paper presents a comprehensive discussion that defines these four pattern types and investigates their properties and their relationship to one another. In addition, these ideas are explored for a variety of datasets (ten UCI datasets, one gene expression dataset and two genetic variation datasets). The results demonstrate the existence, characteristics and statistical significance of the different types of patterns. They also illustrate how pattern characterization can provide novel insights into discriminative pattern mining and the discriminative structure of different datasets. Code for pattern characterization and supplementary documents are available at http://vk.cs.umn.edu/CDP
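As a rough illustration of what "a gain in the discriminative power of a pattern over its subsets" could look like, the sketch below scores an itemset by the difference of its relative support between two classes and compares that score with the best score among its proper subsets. The abstract does not state which measure the paper uses, so the support-difference measure, the function names (`support`, `diff_sup`, `gain_over_subsets`) and the toy transactions are all assumptions made for illustration, not the authors' method.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions (sets of items) that contain every item in itemset."""
    if not transactions:
        return 0.0
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def diff_sup(itemset, class_a, class_b):
    """Assumed discriminative measure: support in class A minus support in class B."""
    return support(itemset, class_a) - support(itemset, class_b)


def gain_over_subsets(itemset, class_a, class_b):
    """Score of the pattern minus the best score among its proper non-empty subsets."""
    items = sorted(itemset)
    if len(items) < 2:
        raise ValueError("need at least two items to compare against subsets")
    best_subset = max(
        diff_sup(sub, class_a, class_b)
        for k in range(1, len(items))
        for sub in combinations(items, k)
    )
    return diff_sup(items, class_a, class_b) - best_subset


# Hypothetical toy data: each transaction is a set of items.
class_a = [{"x", "y"}, {"x", "y"}, {"x", "y"}, {"x"}]
class_b = [{"x"}, {"y"}, {"x"}, {"y"}]

pattern_score = diff_sup({"x", "y"}, class_a, class_b)
pattern_gain = gain_over_subsets({"x", "y"}, class_a, class_b)
```

In this toy data the pair `{"x", "y"}` never occurs in class B, so it separates the classes better than either item alone and the gain is positive. A gain near zero would suggest the pattern adds little beyond one of its subsets. Assigning a pattern to one of the four types named in the abstract would require the paper's own definitions, which are not reproduced here.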
Similar resources
POISketch: Semantic Place Labeling over User Activity Streams
Capturing place semantics is critical for enabling location-based applications. Techniques for assigning semantic labels (e.g., “bar” or “office”) to unlabeled places mainly resort to mining user activity logs by exploiting visiting patterns. However, existing approaches focus on inferring place labels with a static user activity dataset, and ignore the visiting pattern dynamics in user activit...
An MCMC Feature Selection Technique for Characterizing and Classifying Spatial Region Data
We focus on characterizing spatial region data when distinct classes of structural patterns are present. We propose a novel statistical approach based on a supervised framework for reducing the dimensionality of the initial feature space, selecting the most discriminative features. The method employs the statistical techniques of Bootstrapping simulation, Bayesian Inference and Markov Chain Mon...
A Hybrid Classifier for Characterizing Motor Unit Action Potentials in Diagnosing Neuromuscular Disorders
Background: The time and frequency features of motor unit action potentials (MUAPs) extracted from electromyographic (EMG) signals provide discriminative information for the diagnosis and treatment of neuromuscular disorders. However, the results of conventional automatic diagnosis methods using MUAP features are not yet convincing. Objective: The main goal in designing a MUAP characterization system ...
Discovery of Shifting Patterns in Sequence Classification
In this paper, we investigate the multi-variate sequence classification problem from a multi-instance learning perspective. Real-world sequential data commonly show discriminative patterns only at specific time periods. For instance, we can identify a cropland during its growing season, but it looks similar to a barren land after harvest or before planting. Besides, even within the same class, ...
DPClass: An Effective but Concise Discriminative Patterns-Based Classification Framework
Pattern-based classification was originally proposed to improve the accuracy using selected frequent patterns, where many efforts were paid to prune a huge number of non-discriminative frequent patterns. On the other hand, tree-based models have shown strong abilities on many classification tasks since they can easily build high-order interactions between different features and also handle both...
Journal title:
- CoRR
Volume abs/1102.4104
Pages -
Publication date 2011